Abstract

The  adjoint  and  minimum  principle  for  a  partially  observed  diffusion  can 
be  obtained  by  differentiating  the  statement  that  a  control  u*  is  optimal. 
Using  stochastic  flows  the  variation  in  the  cost  resulting  from  a  change  in 
an  optimal  control  can  be  computed  explicitly.  The  technical  difficulty  is 
to  justify  the  differentiation. 
1.  INTRODUCTION.

Using stochastic flows we calculate below the change in the cost due to a 'strong'
variation of an optimal control. Differentiating this quantity enables us to identify the
adjoint, or co-state variable, and give a partially observed minimum principle. If the drift
coefficient is differentiable in the control variable the related result of Bensoussan [2] follows
from our theorem. Full details will appear in [1]. The method appears simpler than that
employed in Haussman [4].

2.  DYNAMICAL  EQUATIONS. 

Suppose  the  state  of  a  stochastic  system  is  described  by  the  equation 

$$d\xi_t = f(t, \xi_t, u)\,dt + g(t, \xi_t)\,dw_t, \qquad \xi_0 = x_0, \quad 0 \le t \le T. \tag{2.1}$$

The control variable $u$ will take values in a compact subset $U$ of some Euclidean space $R^k$.

We  shall  assume 

A1: $x_0 \in R^d$ is given.

A2: $f : [0,T] \times R^d \times U \to R^d$ is Borel measurable, continuous in $u$ for each $(t,x)$,
continuously differentiable in $x$ for each $(t,u)$ and

$$(1 + |x|)^{-1}\,|f(t,x,u)| + |f_x(t,x,u)| \le K_1.$$

A3: $g : [0,T] \times R^d \to R^d \otimes R^n$ is a matrix valued function, Borel measurable, continuously
differentiable in $x$, and for some $K_2$:

$$|g(t,x)| + |g_x(t,x)| \le K_2.$$

The observation process is defined by

$$dy_t = h(\xi_t)\,dt + d\nu_t, \qquad y_t \in R^m, \quad y_0 = 0, \quad 0 \le t \le T. \tag{2.2}$$

In (2.1) and (2.2) $w = (w^1, \dots, w^n)$ and $\nu = (\nu^1, \dots, \nu^m)$ are independent Brownian
motions defined on a probability space $(\Omega, F, P)$.
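
As a purely illustrative aside, the system (2.1) and (2.2) can be simulated with an Euler-Maruyama scheme. The sketch below assumes scalar state and observation ($d = n = m = 1$); the function name `simulate`, its arguments and the step count are hypothetical choices made here and do not come from the paper.

```python
import math
import random


def simulate(f, g, h, u, x0, T=1.0, n_steps=200, seed=0):
    """Euler-Maruyama paths of (2.1)-(2.2) for a scalar state and observation.

    f(t, x, u), g(t, x) and h(x) are user-supplied coefficients (assumed to
    satisfy A2-A4).  u(t, y_path) is a control that is only handed the
    observations recorded up to time t, mimicking Y-predictability.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    xi, y = [x0], [0.0]
    for k in range(n_steps):
        t = k * dt
        dw = rng.gauss(0.0, math.sqrt(dt))   # increment of the state noise w
        dnu = rng.gauss(0.0, math.sqrt(dt))  # increment of the observation noise nu
        x_old = xi[-1]
        uk = u(t, y)                         # control sees y up to time t only
        xi.append(x_old + f(t, x_old, uk) * dt + g(t, x_old) * dw)
        y.append(y[-1] + h(x_old) * dt + dnu)
    return xi, y
```

A call such as `simulate(lambda t, x, u: -x + u, lambda t, x: 1.0, math.tanh, lambda t, y: 0.0, x0=0.0)` returns one sampled pair of state and observation paths on a uniform grid.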

Furthermore,  we  assume 

A4: $h : R^d \to R^m$ is Borel measurable, continuously differentiable in $x$ and

$$|h(t,x)| + |h_x(t,x)| \le K_3.$$

REMARK 2.1. These hypotheses can be weakened to those discussed by Haussman [4].
See [1].

Write $P$ for the Wiener measure on $C([0,T], R^n)$ and $\mu$ for the Wiener measure on
$C([0,T], R^m)$. Let

$$\Omega = C([0,T], R^n) \times C([0,T], R^m)$$

and the coordinate functions in $\Omega$ will be denoted $(x_t, y_t)$. Wiener measure $P$ on $\Omega$ is

$$P(dx, dy) = P(dx)\,\mu(dy).$$

Definition 2.2. $Y = \{Y_t\}$ will be the right continuous, complete filtration on
$C([0,T], R^m)$ generated by

$$Y_t^0 = \sigma\{y_s : s \le t\}.$$

The set of admissible control functions $\underline{U}$ will be the $Y$-predictable functions defined on
$[0,T] \times C([0,T], R^m)$ with values in $U$.

For $u \in \underline{U}$ and $x \in R^d$, $\xi^u_{s,t}(x)$ will denote the strong solution of (2.1) corresponding
to $u$ with $\xi^u_{s,s} = x$.

Define 

$$Z^u_{s,t}(x) = \exp\Big( \int_s^t h(\xi^u_{s,r}(x))'\,dy_r - \frac{1}{2} \int_s^t h(\xi^u_{s,r}(x))^2\,dr \Big). \tag{2.3}$$

Note  a  version  of  Z  defined  for  every  trajectory  y  can  be  obtained  by  integrating  the 
stochastic  integral  in  the  exponential  by  parts. 
If a new probability measure $P^u$ is defined on $\Omega$ by putting

$$\frac{dP^u}{dP} = Z^u_{0,T}(x_0),$$

then under $P^u$ the pair $(\xi^u_{0,t}(x_0), y_t)$ is a solution of the system (2.1) and (2.2). That is, under $P^u$,
$\xi^u_{0,t}(x_0)$ remains a strong solution of (2.1) and there is an independent Brownian motion
$\nu$ such that $y_t$ satisfies (2.2).

Because of hypothesis A4, for $0 \le t \le T$ easy applications of Burkholder's and Gronwall's
inequalities show that

$$E[(Z^u_{0,t}(x_0))^p] < \infty \tag{2.4}$$

for all $u \in \underline{U}$ and all $p$, $1 \le p < \infty$.

COST 2.3. We shall suppose the cost is purely terminal and equals

$$c(\xi^u_{0,T}(x_0))$$

where $c$ is a bounded, differentiable function. If control $u \in \underline{U}$ is used the expected cost is

$$J(u) = E_u[c(\xi^u_{0,T}(x_0))],$$

where $E_u$ denotes expectation under $P^u$. With respect to $P$, under which $y_t$ is a Brownian motion,

$$J(u) = E[Z^u_{0,T}(x_0)\,c(\xi^u_{0,T}(x_0))].$$
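
The representation of $J(u)$ under $P$ suggests a simple Monte Carlo check: sample $y$ as a Brownian motion, accumulate the exponent of (2.3) along each path, and average $Z\,c(\xi)$. The following is only a sketch for the scalar case; `estimate_cost`, its arguments and the default grid and sample sizes are hypothetical names and values chosen for illustration.

```python
import math
import random


def estimate_cost(f, g, h, c, u, x0, T=1.0, n_steps=100, n_paths=2000, seed=0):
    """Monte Carlo estimate of J(u) = E[Z_{0,T} c(xi_{0,T})] under the reference measure P.

    Under P the observation y is a Brownian motion independent of w, and the
    likelihood ratio Z of (2.3) reweights each sampled path.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    sq = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        x, y_path, log_z = x0, [0.0], 0.0
        for k in range(n_steps):
            t = k * dt
            dw = rng.gauss(0.0, sq)
            dy = rng.gauss(0.0, sq)          # y is Brownian under P
            uk = u(t, y_path)                # control uses past observations only
            hx = h(x)
            log_z += hx * dy - 0.5 * hx * hx * dt   # discretised exponent of (2.3)
            x += f(t, x, uk) * dt + g(t, x) * dw
            y_path.append(y_path[-1] + dy)
        total += math.exp(log_z) * c(x)
    return total / n_paths
```

With $h \equiv 0$ the weight is identically one and the estimator reduces to a plain average of $c(\xi_T)$; with $c \equiv 1$ it estimates $E[Z^u_{0,T}(x_0)] = 1$, which gives a quick sanity check on the discretisation.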


A control $u^* \in \underline{U}$ is optimal if

$$J(u^*) \le J(u)$$

for all $u \in \underline{U}$. We shall suppose there is an optimal control $u^*$.

3.  FLOWS. 


For $u \in \underline{U}$ and $x \in R^d$ consider the strong solution

$$\xi^u_{s,t}(x) = x + \int_s^t f(r, \xi^u_{s,r}(x), u_r)\,dr + \int_s^t g(r, \xi^u_{s,r}(x))\,dw_r. \tag{3.1}$$


We wish to consider the behaviour of $\xi^u_{s,t}(x)$ for each trajectory $y$ of the observation
process. In fact the results of Bismut [3] and Kunita [6] extend and show the map

$$\xi^u_{s,t} : R^d \to R^d$$

is, almost surely, a diffeomorphism for each $y \in C([0,T], R^m)$.

Write

$$\|\xi^u(x_0)\|_t = \sup_{0 \le s \le t} |\xi^u_{0,s}(x_0)|.$$

Then, using Gronwall's and Jensen's inequalities, for any $p$, $1 \le p < \infty$,

$$\|\xi^u(x_0)\|_T^p \le C\Big(1 + |x_0|^p + \Big\| \int_0^{\cdot} g(r, \xi^u_{0,r}(x_0))\,dw_r \Big\|_T^p\Big)$$

almost surely, for some constant $C$.

Using A3 and Burkholder's inequality,

$$\|\xi^u(x_0)\|_T \in L^p \quad \text{for } 1 \le p < \infty.$$

Suppose $u^*$ is an optimal control, and write $\xi^*_{s,t}(\cdot)$ for $\xi^{u^*}_{s,t}(\cdot)$.

The Jacobian $\partial \xi^*_{s,t} / \partial x$ is the matrix solution $C_t$ of the equation

$$dC_t = f_x(t, \xi^*_{s,t}(x), u^*_t)\,C_t\,dt + \sum_{i=1}^n g^{(i)}_x(t, \xi^*_{s,t}(x))\,C_t\,dw^i_t, \qquad C_s = I. \tag{3.2}$$

Here $g^{(i)}$ is the $i$th column of $g$ and $I$ is the $d \times d$ identity matrix. Writing
$\|C\|_T = \sup_{0 \le s \le T} |C_s|$ and using Burkholder's, Jensen's and Gronwall's inequalities we see
$\|C\|_T \in L^p$, $1 \le p < \infty$.

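Consider the matrix valued process $D$ defined by

$$D_t = I - \int_s^t D_r f_x(r, \xi^*_{s,r}(x), u^*_r)\,dr - \sum_{i=1}^n \int_s^t D_r\,g^{(i)}_x(r, \xi^*_{s,r}(x))\,dw^i_r + \sum_{i=1}^n \int_s^t D_r \big(g^{(i)}_x(r, \xi^*_{s,r}(x))\big)^2\,dr. \tag{3.3}$$

Then as in [5] or [6] $d(D_t C_t) = 0$ and $D_s C_s = I$ so

$$D_t = C_t^{-1} = \Big(\frac{\partial \xi^*_{s,t}}{\partial x}\Big)^{-1}.$$

Furthermore, $\|D\|_T \in L^p$, $1 \le p < \infty$.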
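
The identity $d(D_t C_t) = 0$ can be checked directly with the Itô product rule. The sketch below uses the shorthand $A_r = f_x(r, \xi^*_{s,r}(x), u^*_r)$ and $B^i_r = g^{(i)}_x(r, \xi^*_{s,r}(x))$, which is introduced here only for brevity and is not notation from the paper.

```latex
\begin{align*}
dC_t &= A_t C_t\,dt + \sum_{i=1}^n B^i_t C_t\,dw^i_t, \\
dD_t &= -D_t A_t\,dt - \sum_{i=1}^n D_t B^i_t\,dw^i_t + \sum_{i=1}^n D_t (B^i_t)^2\,dt, \\
d(D_t C_t) &= (dD_t)\,C_t + D_t\,(dC_t) + d\langle D, C\rangle_t \\
 &= \Big(-D_t A_t C_t + \sum_{i=1}^n D_t (B^i_t)^2 C_t + D_t A_t C_t
        - \sum_{i=1}^n D_t B^i_t B^i_t C_t\Big)\,dt \\
 &\quad + \sum_{i=1}^n \big(-D_t B^i_t C_t + D_t B^i_t C_t\big)\,dw^i_t \;=\; 0.
\end{align*}
```

The drift correction $\sum_i D_t (B^i_t)^2\,dt$ in the equation for $D$ is exactly what cancels the cross-variation term, which is why it appears with a plus sign.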

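Suppose $z_t = z_s + A_t + \sum_{i=1}^n \int_s^t H_i\,dw^i_r$ is a $d$-dimensional semimartingale. Bismut
[3] shows one can consider the process $\xi^*_{s,t}(z_t)$ and in fact:

$$\begin{aligned}
\xi^*_{s,t}(z_t) = z_s &+ \int_s^t \Big( f(r, \xi^*_{s,r}(z_r), u^*_r) + \sum_{i=1}^n g^{(i)}_\xi(r, \xi^*_{s,r}(z_r))\,\frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\,H_i + \frac{1}{2} \sum_{i=1}^n \frac{\partial^2 \xi^*_{s,r}}{\partial x^2}(z_r)(H_i, H_i) \Big)\,dr \\
&+ \int_s^t \frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\,dA_r + \int_s^t g(r, \xi^*_{s,r}(z_r))\,dw_r + \sum_{i=1}^n \int_s^t \frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\,H_i\,dw^i_r.
\end{aligned} \tag{3.4}$$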


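DEFINITION 3.1. For $s \in [0,T]$, $h > 0$ such that $0 \le s < s + h \le T$, for any $\tilde{u} \in \underline{U}$, and
$A \in Y_s$ consider a 'strong' variation $u$ of $u^*$ defined by

$$u(t, w) = \begin{cases} u^*(t, w) & \text{if } (t, w) \notin (s, s+h] \times A, \\ \tilde{u}(t, w) & \text{if } (t, w) \in (s, s+h] \times A. \end{cases}$$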
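
To make the construction concrete, a strong variation can be written as a small wrapper around two controls. This is only a sketch under assumed conventions: controls are modelled as functions of the time and the observed path so far, the event $A$ is modelled by a hypothetical predicate `in_A`, and the name `strong_variation` is not from the paper.

```python
def strong_variation(u_star, u_tilde, s, h, in_A):
    """Return the control equal to u_tilde on (s, s+h] x A and to u_star elsewhere.

    u_star, u_tilde: callables (t, y_path) -> control value.
    in_A: callable y_path -> bool.  For A to lie in Y_s the caller must make
    in_A depend only on the observations up to time s (an assumption that is
    not enforced here).
    """
    def u(t, y_path):
        if s < t <= s + h and in_A(y_path):
            return u_tilde(t, y_path)
        return u_star(t, y_path)
    return u
```

Because the perturbed control differs from $u^*$ only on a window of length $h$, the resulting change in cost is of order $h$, which is what makes the later division by $h$ meaningful.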
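THEOREM 3.2. For any strong variation $u$ of $u^*$ consider the process

$$z_t = x + \int_s^t \Big(\frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\Big)^{-1} \big( f(r, \xi^*_{s,r}(z_r), u_r) - f(r, \xi^*_{s,r}(z_r), u^*_r) \big)\,dr.$$

Then the process $\xi^*_{s,t}(z_t)$ is indistinguishable from $\xi^u_{s,t}(x)$.

PROOF. We shall substitute in (3.4), noting $H_t = 0$ for all $t$. Therefore,

$$\begin{aligned}
\xi^*_{s,t}(z_t) = x &+ \int_s^t f(r, \xi^*_{s,r}(z_r), u^*_r)\,dr \\
&+ \int_s^t \frac{\partial \xi^*_{s,r}}{\partial x}(z_r) \Big(\frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\Big)^{-1} \big( f(r, \xi^*_{s,r}(z_r), u_r) - f(r, \xi^*_{s,r}(z_r), u^*_r) \big)\,dr \\
&+ \int_s^t g(r, \xi^*_{s,r}(z_r))\,dw_r \\
= x &+ \int_s^t f(r, \xi^*_{s,r}(z_r), u_r)\,dr + \int_s^t g(r, \xi^*_{s,r}(z_r))\,dw_r.
\end{aligned}$$

The solution of (3.1) is unique, so $\xi^*_{s,t}(z_t) = \xi^u_{s,t}(x)$. Note $u(t) = u^*(t)$ if $t > s + h$, so
$z_t = z_{s+h}$ if $t > s + h$ and

$$\xi^u_{s,t}(x) = \xi^*_{s,t}(z_{s+h}).$$
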

4.  THE  EXPONENTIAL  DENSITY 


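Consider the $(d+1)$-dimensional system

$$\begin{aligned}
\xi^*_{s,t}(x) &= x + \int_s^t f(r, \xi^*_{s,r}(x), u^*_r)\,dr + \int_s^t g(r, \xi^*_{s,r}(x))\,dw_r, \\
Z^*_{s,t}(x, z) &= z + \int_s^t Z^*_{s,r}(x, z)\,h(\xi^*_{s,r}(x))'\,dy_r.
\end{aligned} \tag{4.1}$$

That is, we are considering an augmented flow $(\xi, Z)$ in $R^{d+1}$ in which $Z^*$ has a variable
initial condition $z \in R$. Note:

$$Z^*_{s,t}(x, z) = z\,Z^*_{s,t}(x).$$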
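The map $(x, z) \to (\xi^*_{s,t}(x), Z^*_{s,t}(x, z))$ is, almost surely, a diffeomorphism of $R^{d+1}$. Note that

$$\frac{\partial \xi^*_{s,t}}{\partial z} = 0, \quad \frac{\partial f}{\partial z} = 0 \quad \text{and} \quad \frac{\partial g}{\partial z} = 0.$$

The Jacobian of this augmented map is represented by the matrix

$$\begin{pmatrix} \dfrac{\partial \xi^*_{s,t}}{\partial x} & 0 \\[2mm] \dfrac{\partial Z^*_{s,t}}{\partial x} & \dfrac{\partial Z^*_{s,t}}{\partial z} \end{pmatrix}.$$

In particular, from (4.1), for $1 \le i \le d$

$$\frac{\partial Z^*_{s,t}}{\partial x_i} = \sum_{j=1}^m \int_s^t \Big( Z^*_{s,r} \sum_{k=1}^d \frac{\partial h^j}{\partial \xi_k}(\xi^*_{s,r}(x))\,\frac{\partial \xi^{*,k}_{s,r}}{\partial x_i} + h^j(\xi^*_{s,r}(x))\,\frac{\partial Z^*_{s,r}}{\partial x_i} \Big)\,dy^j_r. \tag{4.2}$$

We are interested in solutions of (4.1) and (4.2) only when $z = 1$, so as above we shall write
$Z^*_{s,t}(x)$ for $Z^*_{s,t}(x, 1)$.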


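Lemma 4.1.

$$\frac{\partial Z^*_{s,t}}{\partial x} = Z^*_{s,t}(x) \int_s^t h_\xi(\xi^*_{s,r}(x))\,\frac{\partial \xi^*_{s,r}}{\partial x}\,d\nu_r,$$

where, as in (2.2), $d\nu_t = dy_t - h(\xi^*_{s,t}(x))\,dt$.

PROOF. From (4.2)

$$\frac{\partial Z^*_{s,t}}{\partial x} = \int_s^t \Big( Z^*_{s,r}\,h_\xi(\xi^*_{s,r}(x))\,\frac{\partial \xi^*_{s,r}}{\partial x} + h(\xi^*_{s,r}(x))'\,\frac{\partial Z^*_{s,r}}{\partial x} \Big)\,dy_r.$$

Write

$$L_{s,t}(x) = Z^*_{s,t}(x) \Big( \int_s^t h_\xi(\xi^*_{s,r}(x))\,\frac{\partial \xi^*_{s,r}}{\partial x}\,d\nu_r \Big).$$

Then

$$Z^*_{s,t}(x) = 1 + \int_s^t Z^*_{s,r}(x)\,h(\xi^*_{s,r}(x))'\,dy_r$$

and the product rule gives

$$L_{s,t}(x) = \int_s^t L_{s,r}(x)\,h(\xi^*_{s,r}(x))'\,dy_r + \int_s^t Z^*_{s,r}(x)\,h_\xi(\xi^*_{s,r}(x))\,\frac{\partial \xi^*_{s,r}}{\partial x}\,dy_r.$$

Therefore $L_{s,t}(x)$ satisfies the same linear equation as $\partial Z^*_{s,t} / \partial x$, and by uniqueness of the
solution the two processes coincide.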
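The minimum cost is

$$\begin{aligned}
J(u^*) &= E[Z^*_{0,T}(x_0)\,c(\xi^*_{0,T}(x_0))] \\
&= E[Z^*_{0,s}(x_0)\,Z^*_{s,T}(x)\,c(\xi^*_{s,T}(x))],
\end{aligned}$$

where $x = \xi^*_{0,s}(x_0)$. Also,

$$\begin{aligned}
J(u) &= E[Z^*_{0,s}(x_0)\,Z^u_{s,T}(x)\,c(\xi^u_{s,T}(x))] \\
&= E[Z^*_{0,s}(x_0)\,Z^*_{s,T}(z_{s+h})\,c(\xi^*_{s,T}(z_{s+h}))]
\end{aligned}$$

by (3.6) and (1.5). Recall $Z^*_{s,T}(\cdot)$ and $c(\xi^*_{s,T}(\cdot))$ are differentiable almost surely, with
continuous and uniformly integrable derivatives. Consequently, writing

$$\Gamma(s, z_r) = Z^*_{0,s}(x_0)\,Z^*_{s,T}(z_r) \Big\{ c_\xi(\xi^*_{s,T}(z_r))\,\frac{\partial \xi^*_{s,T}}{\partial x}(z_r) + c(\xi^*_{s,T}(z_r)) \int_s^T h_\xi(\xi^*_{s,\sigma}(z_r))\,\frac{\partial \xi^*_{s,\sigma}}{\partial x}(z_r)\,d\nu_\sigma \Big\} \Big(\frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\Big)^{-1}$$

for $s \le r \le s + h$, we have

$$\begin{aligned}
J(u) - J(u^*) &= E\big[Z^*_{0,s}(x_0) \big( Z^*_{s,T}(z_{s+h})\,c(\xi^*_{s,T}(z_{s+h})) - Z^*_{s,T}(x)\,c(\xi^*_{s,T}(x)) \big)\big] \\
&= E\Big[ \int_s^{s+h} \Gamma(s, z_r) \big( f(r, \xi^*_{s,r}(z_r), u_r) - f(r, \xi^*_{s,r}(z_r), u^*_r) \big)\,dr \Big].
\end{aligned} \tag{5.1}$$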
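
The second equality in (5.1) rests on a pathwise chain rule, since $r \mapsto z_r$ in Theorem 3.2 is absolutely continuous. A sketch of the step, writing $\Phi(z) = Z^*_{s,T}(z)\,c(\xi^*_{s,T}(z))$ (a symbol introduced here only for illustration) and assuming the differentiability recalled above:

```latex
\begin{align*}
\Phi(z_{s+h}) - \Phi(x)
 &= \int_s^{s+h} \frac{\partial \Phi}{\partial x}(z_r)\,\frac{dz_r}{dr}\,dr, \\
\frac{\partial \Phi}{\partial x}(z)
 &= c_\xi(\xi^*_{s,T}(z))\,\frac{\partial \xi^*_{s,T}}{\partial x}(z)\,Z^*_{s,T}(z)
   + c(\xi^*_{s,T}(z))\,\frac{\partial Z^*_{s,T}}{\partial x}(z), \\
\frac{dz_r}{dr}
 &= \Big(\frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\Big)^{-1}
   \big( f(r, \xi^*_{s,r}(z_r), u_r) - f(r, \xi^*_{s,r}(z_r), u^*_r) \big).
\end{align*}
```

Multiplying by $Z^*_{0,s}(x_0)$, substituting Lemma 4.1 for $\partial Z^*_{s,T} / \partial x$ and taking expectations gives the integrand of (5.1).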


This formula describes the change in the expected cost arising from the perturbation $u$ of
the optimal control. However, $J(u) \ge J(u^*)$ for all $u \in \underline{U}$ so the right hand side of (5.1)
is non-negative for all $h > 0$. We wish to divide by $h > 0$ and let $h \to 0$. This requires
some careful arguments using the uniform boundedness of the random variables and the
monotone class theorem. It can be shown that there is a set $S \subset [0,T]$ of zero Lebesgue
measure such that if $s \notin S$

$$E\big[ \Gamma(s, x) \big( f(s, \xi^*_{0,s}(x_0), u) - f(s, \xi^*_{0,s}(x_0), u^*_s) \big) I_A \big] \ge 0 \tag{5.2}$$

for any $u \in U$ and $A \in Y_s$.


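Details of this argument can be found in [1]. Define

$$p_s(x) = E^*\Big[ c_\xi(\xi^*_{0,T}(x_0))\,\frac{\partial \xi^*_{s,T}}{\partial x}(x) + c(\xi^*_{0,T}(x_0)) \Big( \int_s^T h_\xi(\xi^*_{0,r}(x_0))\,\frac{\partial \xi^*_{s,r}}{\partial x}(x)\,d\nu_r \Big) \,\Big|\, Y_s \vee \{x\} \Big]$$

where $x = \xi^*_{0,s}(x_0)$ and $E^*$ is the expectation under $P^* = P^{u^*}$.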

In  (5.2)  we  have  established  the  following: 


THEOREM 5.1. $p_s(x)$ is the adjoint process for the partially observed optimal control
problem. That is, if $u^* \in \underline{U}$ is optimal there is a set $S \subset [0,T]$ of zero Lebesgue measure
such that for $s \notin S$

$$E^*[p_s(x)\,f(s, x, u^*_s) \mid Y_s] \le E^*[p_s(x)\,f(s, x, u) \mid Y_s] \quad \text{a.s.} \tag{5.3}$$

so the optimal control $u^*$ almost surely minimizes the conditional Hamiltonian.

If $x = \xi^*_{0,s}(x_0)$ has a conditional density $q_s(x)$ under $P^*$, and if $f$ is differentiable
in $u$, (5.3) implies

$$\sum_{i=1}^k (u_i(s) - u^*_i(s)) \int_{R^d} \Gamma(s, x)\,\frac{\partial f}{\partial u_i}(s, x, u^*)\,q_s(x)\,dx \ge 0.$$

This is the result of Bensoussan [2].